DOCUMENT RESUME 

ED 322 181 TM 015 334 



AUTHOR Gafni, Naomi; Melamed, Estela

TITLE Differential Tendencies To Guess as a Function of 

Gender and Lingual-Cultural Reference Group. 
PUB DATE 88 
NOTE 16p. 

PUB TYPE Reports - Research/Technical (143) 

EDRS PRICE MF01/PC01 Plus Postage. 

DESCRIPTORS College Bound Students; *College Entrance
Examinations; Comparative Testing; *Cultural
Differences; Foreign Countries; *Guessing (Tests);
Higher Education; High Schools; High School Students;
Languages; Predictor Variables; *Response Style
(Tests); *Sex Differences; Test Format; Testing
Problems; *Test Wiseness
IDENTIFIERS Israel



ABSTRACT 

The objective of this study was to investigate differential
tendencies to avoid guessing as a function of three variables:
lingual-cultural group, gender, and examination year. The
Psychometric Entrance Test (PET) for universities in Israel was
used, which is administered in Hebrew, Arabic, English, French,
Spanish, and Russian. The PET is a battery of five subtests, and
encompasses about 200 test items. Three of the five subtests were
used in this study: figural reasoning, mathematical reasoning, and
English. Data for 12,440 male and 10,532 female examinees were
analyzed. The tendency to avoid guessing was measured by the
proportion of two types of unanswered items: unreached items and
omitted items. A factor analysis using VARIMAX rotation indicated a
strong two-factor structure, in which all indices based on omitted
items loaded on the first factor and all indices based on unreached
items loaded on the second factor. An analysis of covariance with the
corrected-for-guessing scores as a covariate indicated that all three
effects and paired interactions were significant. The examination
year appeared to have the strongest effect on the proportion of
omitted items, while language-version seemed to affect the proportion
of unreached items most strongly. The gender effect was not found to
be as strong as were the other two main effects. Three data tables
are included. (RLC)

Abstract

The objective of this study was to investigate differential 
tendencies to avoid guessing as a function of three variables: 
lingual-cultural group, gender, and examination year. The Psychometric
Entrance Test (including five subtests) to the universities in Israel,
administered in five different languages, was used as the instrument.
The tendency to avoid guessing was measured by the proportion of two
types of unanswered items: unreached items and omitted items.

A factor analysis with VARIMAX rotation indicated a strong
two-factor structure, where all indices based on omitted items loaded on
the first factor and all indices based on unreached items loaded on the
second factor. ANCOVA with the corrected-for-guessing scores as a
covariate indicated that all three effects and paired interactions were
significant. Exam year appeared to have the strongest effect on the
proportion of omitted items, while language-version seemed to affect
the proportion of unreached items most strongly. The gender effect was
not found to be as strong as the other two main effects. 




The National Institute for Testing and Evaluation (NITE) tests 
approximately 40,000 applicants for the universities and colleges in 
Israel every year, by means of the Psychometric Entrance Test (PET). 
The test is constructed in Hebrew but approximately twelve percent of 
the applicants are not native Hebrew speakers; as a result, NITE faces 
the unique challenge of translating the test into several languages and 
then equating the scores of the translated versions to the original 
Hebrew form. The largest non-Hebrew-speaking minority taking a 
translated version of PET is the Arabic-speaking population. Other 
languages into which PET is translated are: English, French, Spanish, 
and Russian. 

No correction for guessing is used in scoring the exam, and 
examinees are encouraged to guess when they do not know the correct 
answer. Yet only 75% to 93% of the examinees (depending on the
specific subtest in the PET battery) respond to all the items in the 
test. Since PET scores are used for admissions decisions, differences 
in scores that result from differential tendencies to guess, however 
small or rare, are perceived to be important. 

The literature on guessing behavior has chiefly dealt with three 
issues: (1) the relative advantages and disadvantages of different 
kinds of test instructions (e.g., Angoff and Schrader, 1981; Cronbach, 
1970; Swineford and Miller, 1953); (2) individual differences in
guessing as related to personality traits such as "gambling tendency"
(Swineford, 1938), cautiousness (Gulliksen, 1950) and risk taking
perception (Cohen, 1960); and (3) group differences (e.g., gender
differences; Sini, 1984). 

Different indices were used to measure guessing (Angoff &
Schrader, 1981; Slakter, 1967; Swineford, 1938; Ziller, 1957), in some
of which the proportion of unanswered items was considered an
indication of the tendency to avoid guessing. In this study two types
of unanswered items were considered: unreached items and omitted
items. Unreached items are identified by working from the end of an
examinee's response string toward the beginning, taking unanswered
items as unreached until an answer is encountered. Unanswered items
preceding this last answered item are considered intentionally omitted
items (Mislevy, 1988).
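
The rule for separating unreached from omitted items can be sketched in a few lines of Python. This is a minimal illustration of the definition given above, not the scoring program used in the study; the function names and the use of None to mark an unanswered item are assumptions made for the example.

```python
from typing import List, Optional, Tuple


def count_unanswered(responses: List[Optional[str]]) -> Tuple[int, int]:
    """Return (omitted, unreached) counts for one response string.

    None marks an unanswered item (an assumed encoding).
    """
    # Walk backward from the last item: every unanswered item met before
    # the first answer is counted as unreached.
    unreached = 0
    for response in reversed(responses):
        if response is not None:
            break
        unreached += 1

    # Unanswered items that precede the last answered item are omitted.
    reached = responses[: len(responses) - unreached]
    omitted = sum(1 for response in reached if response is None)
    return omitted, unreached


def unanswered_proportions(responses: List[Optional[str]]) -> Tuple[float, float]:
    """Return the proportions (omitted, unreached) out of all items."""
    if not responses:
        raise ValueError("response string is empty")
    omitted, unreached = count_unanswered(responses)
    return omitted / len(responses), unreached / len(responses)


if __name__ == "__main__":
    # Hypothetical 8-item subtest: item 3 skipped, last two never reached.
    example = ["A", "C", None, "B", "D", "A", None, None]
    print(count_unanswered(example))        # (1, 2)
    print(unanswered_proportions(example))  # (0.125, 0.25)
```

Note that under this rule an examinee who answers nothing has every item counted as unreached and none as omitted.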

The objective of this study was to investigate differential 
tendencies to avoid guessing (as defined above) as a function of the 
following factors: 

a) Language-group. Since the different language-groups taking PET 
come from different countries and cultures, it was postulated that 
they might manifest different guessing behaviors. For example, it 
was expected that the English-speaking group would be more 
familiar with multiple-choice tests and, therefore, would be 
likely to closely follow the test instructions. On the other hand, 
the Russian-speaking group, less acquainted with this type of 
test, might be less inclined to guess. 

b) Gender. Previous findings supported the existence of differences
in guessing behavior between the two genders (Sini, 1985;
Slakter, Koehler, Hampton & Grennel, 1971; Swineford, 1941) and
related them to differences between the two groups in personality
traits such as tendency for risk taking (Keinan, Meir &
Gome-Nemirovsky, 1984; Slakter, 1967).

c) Examination date. Since the first year in which NITE administered
its exams (1984), various evidence had accumulated, suggesting
that the population of examinees had become more familiar with the
test and was better prepared for it. This increasing familiarity
with the test format was considered likely to reduce the effect of
personality tendencies, so that current examinees would be more 
inclined to profit from the test directions and omit a smaller 
number of items than past examinees. 

Method 

Instrument 

PET is a battery of five subtests, encompassing about 200 test 
items. Three of the five subtests, which were identical for the 
different language-versions, were used for this study. There were
slight differences in the number of items included in each of the PET 
forms used in the study. The three subtests were: 

1. Figural Reasoning (FO), which contained between 25 and 27 items.

2. Mathematical Reasoning (MA), which contained between 30 and 35
items.

3. English (EN), which contained between 48 and 50 items, was a test
of English as a foreign language and included reading
comprehension, sentence completion and restatement items.

Each language-version was essentially a translation of the Hebrew
version. The FO and EN subtests were identical for all
language-versions. The MA subtest was translated and reviewed by bilingual
experts. Scores on the different language-versions were equated to
those on the Hebrew version using FO and MA as anchor tests. Each
subtest was scored separately on a scale with mean = 100 and standard
deviation = 20. The final PET score was a simple mean of the five
subtests, and was reported on a scale with mean = 500 and standard
deviation = 100.
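
The text does not say how scores were placed on these reporting scales. One common approach is a linear transformation of standardized scores, sketched below purely under that assumption; the function name and the sample scores are hypothetical, and the equating of language-versions is not modeled.

```python
from statistics import fmean, pstdev
from typing import List


def rescale(scores: List[float], target_mean: float, target_sd: float) -> List[float]:
    """Linearly map scores so the group has the given mean and SD.

    Assumed method: z-score each value against the group's own mean and
    population SD, then stretch and shift to the target scale.
    """
    mean = fmean(scores)
    sd = pstdev(scores)
    if sd == 0:
        raise ValueError("scores have zero variance and cannot be rescaled")
    return [target_mean + target_sd * (s - mean) / sd for s in scores]


if __name__ == "__main__":
    # Hypothetical raw subtest scores for five examinees.
    raw = [12, 15, 20, 22, 31]
    subtest_scale = rescale(raw, target_mean=100, target_sd=20)
    print([round(s, 1) for s in subtest_scale])
```

The same function with target_mean=500 and target_sd=100 would give a scale like the one used for the final PET score, again assuming a linear mapping.
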
Subjects 

Table 1 presents the distribution of examinees by language-version
and gender for Form 1, which was administered in all languages in 1984;
Form 17, which was administered in Arabic and Hebrew in April 1987; and
Form 18, which was administered in five languages (all except Arabic) in
April 1987.

[Insert Table 1 about here] 
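
As a consistency check, the cell counts in Table 1 can be summed and compared with the totals given in the document resume (12,440 males and 10,532 females). The sketch below assumes that the partly illegible Hebrew female count for Form 17 reads 3815; the dictionary layout and function name are just one convenient encoding.

```python
from typing import Dict, Tuple

# Counts transcribed from Table 1 as (males, females) per form.
# A form missing from a language means it was not administered in it.
TABLE_1: Dict[str, Dict[str, Tuple[int, int]]] = {
    "Hebrew":  {"Form 1": (520, 467), "Form 17": (3786, 3815), "Form 18": (2810, 2716)},
    "Arabic":  {"Form 1": (1876, 839), "Form 17": (2383, 1460)},
    "English": {"Form 1": (121, 225), "Form 18": (154, 183)},
    "French":  {"Form 1": (82, 132), "Form 18": (94, 137)},
    "Spanish": {"Form 1": (123, 146), "Form 18": (367, 304)},
    "Russian": {"Form 1": (63, 52), "Form 18": (61, 56)},
}


def totals_by_gender(table: Dict[str, Dict[str, Tuple[int, int]]]) -> Tuple[int, int]:
    """Sum male and female counts over all languages and forms."""
    males = sum(m for forms in table.values() for m, _ in forms.values())
    females = sum(f for forms in table.values() for _, f in forms.values())
    return males, females


if __name__ == "__main__":
    print(totals_by_gender(TABLE_1))  # (12440, 10532)
```

With that reading of the damaged cell, the sums agree with the reported totals.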

Procedure 

Two indices were used: the proportion of unreached items and the 
proportion of omitted items as defined above. In order to examine the 
question of whether the two types of unanswered items indicated two 
separate tendencies, a factor analysis was run on each of the
within-group correlation matrices of the ten variables (two indices by
five subtests, including the General Knowledge (GK) and Analytical
Reasoning (RE) subtests, which were not included in the other
analyses because they were not identical for all languages).
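
For readers unfamiliar with the rotation step, the sketch below shows the closed-form varimax rotation for the special case of two factors. It maximizes the raw varimax criterion without the row normalization that statistical packages may apply, so it is an illustration of the technique and not a reproduction of the analysis reported here. The function names and the example loadings are hypothetical.

```python
import math
from typing import List, Tuple

Loadings = List[Tuple[float, float]]


def varimax_criterion(loadings: Loadings) -> float:
    """Raw varimax criterion: summed variance of squared loadings per factor."""
    p = len(loadings)
    total = 0.0
    for factor in range(2):
        squared = [row[factor] ** 2 for row in loadings]
        mean = sum(squared) / p
        total += sum((s - mean) ** 2 for s in squared) / p
    return total


def rotate_two_factors(loadings: Loadings) -> Loadings:
    """Rotate a p x 2 loading matrix to maximize the raw varimax criterion."""
    p = len(loadings)
    u = [x * x - y * y for x, y in loadings]
    v = [2.0 * x * y for x, y in loadings]
    a, b = sum(u), sum(v)
    c = sum(ui * ui - vi * vi for ui, vi in zip(u, v))
    d = 2.0 * sum(ui * vi for ui, vi in zip(u, v))
    # Closed-form angle for a single pair of factors; atan2 selects the
    # quadrant that yields a maximum rather than a minimum.
    phi = math.atan2(d - 2.0 * a * b / p, c - (a * a - b * b) / p) / 4.0
    cos_phi, sin_phi = math.cos(phi), math.sin(phi)
    return [(x * cos_phi + y * sin_phi, -x * sin_phi + y * cos_phi)
            for x, y in loadings]


if __name__ == "__main__":
    # Hypothetical unrotated loadings for six indices (not data from the study).
    unrotated = [(0.60, 0.50), (0.70, 0.40), (0.65, 0.45),
                 (0.50, -0.60), (0.55, -0.50), (0.60, -0.45)]
    for first, second in rotate_two_factors(unrotated):
        print(f"{first: .3f} {second: .3f}")
```

Because the rotation is orthogonal, each variable's communality (the sum of its squared loadings) is unchanged; only the distribution of loadings across the two factors changes.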

For each of the six dependent variables (two indices x three
subtests) a covariance analysis (using the SAS procedure GLM) was
performed with language group, gender, and exam date as independent
variables, and with the formula score as a covariate (adjusted for the
slightly different number of items in the different test forms). This
score was preferred over the number-right score because [it was assumed]
that it would moderate the confounding of the number-right score with
the proportion of [unanswered] items (a similar analysis, with
the scaled number-right score as the covariate, yielded highly similar
results).
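
The formula score itself is not spelled out in the text. A minimal sketch follows, assuming the conventional correction for guessing (number right minus number wrong divided by the number of options less one) and assuming that the adjustment for test length is a simple division by the number of items. The function names and the option count used in the example are hypothetical.

```python
def formula_score(n_right: int, n_wrong: int, n_options: int) -> float:
    """Conventional corrected-for-guessing score: R - W / (k - 1).

    Unanswered items (omitted or unreached) count as neither right nor wrong.
    """
    if n_options < 2:
        raise ValueError("an item needs at least two options")
    return n_right - n_wrong / (n_options - 1)


def length_adjusted_formula_score(n_right: int, n_wrong: int,
                                  n_items: int, n_options: int) -> float:
    """Formula score as a proportion of the number of items (assumed adjustment)."""
    if n_items <= 0 or n_right + n_wrong > n_items:
        raise ValueError("inconsistent item counts")
    return formula_score(n_right, n_wrong, n_options) / n_items


if __name__ == "__main__":
    # Hypothetical examinee: 27 items, 18 right, 6 wrong, 3 unanswered,
    # four options per item (the option count is an assumption).
    print(formula_score(18, 6, 4))  # 16.0
    print(round(length_adjusted_formula_score(18, 6, 27, 4), 4))
```

Because unanswered items do not enter the formula, two examinees with the same number right but different numbers of unanswered items receive different formula scores, which is the property that makes it a more suitable covariate here than the number-right score.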



Results and Discussion 

Factor Analysis 



The factor analysis (rotated by the VARIMAX method) indicated a
strong two-factor structure, where all indices based on omitted items
were found loaded on the first factor and those indices based on
unreached items loaded on the second factor. This structure was evident
for all language-version by gender groups. Table 2 presents the
rotated factor pattern matrix for the Hebrew-speaking males examined in
Form 17; similar matrices [were obtained for the other groups]. This
structure is an indication of two distinct tendencies: one, a tendency
to omit items when uncertain of the correct answer; the other, a
disposition to randomly guess at the end of a subtest when unable to
try all items, due to a time limit.

[Insert Table 2 about here] 

Language-Group Effect

Table 3 presents the means and standard deviations of the
proportions of omitted and unreached items on the three subtests, for
the six language groups and the two genders, examined in 1984 and 1987.

A language-group effect was found for both types of unanswered
items, and in particular, for the proportion of unreached items.
Russian-, Arabic- and French-speaking examinees tended to omit more 
items than Hebrew-, English- and Spanish-speaking examinees in 1984; in 
1987 Russian-speaking examinees tended to omit more items than all 
other groups. The Russian-speaking group tended not to answer (i.e., 
did not reach) more items at the end of subtests than the other 
groups, in both 1984 and 1987. It is worth noting that while for the 
Hebrew- and Russian-speaking examinees the proportion of unreached 
items was larger than the proportion of omitted items, for the Arabic- 
speaking examinees the proportion of omitted items was larger. For the 
other groups, no such difference was found. 

[Insert Table 3 about here]

Gender Effect 

A gender effect and an interaction effect of gender with
language-version and with examination date were found for all variables,
indicating a smaller difference between the two gender groups in 1987.
A gender effect was found for Form 17, mainly due to the gender
differences found for the Arabic-speaking group, for both types of
unanswered items. An interaction effect of language-version and gender
was found on this form for the unreached items indices only, where the
difference between the two gender groups was larger for the Arabic
version. Although significant, this effect was not found to be as
strong as the other two main effects (language group and examination
date) or as the interaction effects.

Examination Date Effect

The proportion of both types of unanswered items dropped
significantly from 1984 to 1987. These results were attributed to an
intensive educational operation being implemented among the candidates
with respect to test preparation. An interaction effect was found for
exam date with language group. While the Arabic-speaking examinees
tended to omit and not reach more items than the Hebrew-speaking
examinees in 1984, they tended to answer more items (guess more) than
their Hebrew-speaking counterparts in 1987 (the scores of the
Arabic-speaking examinees were about one standard deviation lower than
those of the Hebrew-speaking counterparts in both years).

Summary

The difference between the tendencies to omit and to "unreach"
items has already been recognized, for example, in the context of item
parameter estimation (e.g., Mislevy, 1988; Mislevy & Bock, 1986). There
is also some preliminary evidence suggesting that the two different
types of "missing values" affect the assessment of test dimensionality
(Ben-Simon & Cohen, personal communication). The results obtained in
this study support the idea that omitting items and not reaching items
are two separate phenomena. They also suggest that guessing behavior
can be taught but not entirely eliminated. Exam date appeared to have
the strongest effect on the proportion of omitted items, while
language-version seemed to affect the proportion of unreached items
most strongly. In addition, exam date had a greater effect on the
proportion of omitted items than on the proportion of unreached items,
indicating that it is probably easier to train people not to leave an
item unanswered if it has already been tried than to train them to
randomly guess untried items at the end of the test.

People with differing cultural backgrounds, as well as the two 
genders, differ in their tendency to guess. It is probable that some of 
the lower scores of certain groups on multiple choice tests can be 
partially explained by their tendency to avoid guessing, as are some of 
the differences in performance among the language groups. It is 
recommended to emphasize the importance of test directions, in 
particular, among members of groups known to avoid guessing.

Table 1

Frequencies of Examinees by Version and Gender

                              Test Form

            Form 1            Form 17           Form 18
Language    Males   Females   Males   Females   Males   Females

Hebrew        520       467    3786      3815    2810      2716
Arabic       1876       839    2383      1460
English       121       225                       154       183
French         82       132                        94       137
Spanish       123       146                       367       304
Russian        63        52                        61        56

Table 2

Rotated Factor Pattern Matrix for Proportion
of Omitted (denoted by suffix A) and
Unreached Items (denoted by suffix R) for
Five Subtests in Form 17 for Male Hebrew Examinees

Subtest    Factor 1    Factor 2

RE-R         .754        .190
MA-R         .707        .103
FO-R         .643        .209
EN-R         .625        .090
GK-R         .426        .063
MA-A         .095        .764
FO-A         .094        .602
RE-A         .080        .589
EN-A         .117        .471
GK-A         .171        .451

Table 3

Means and Standard Deviations of the Proportion of Omitted and Unreached Items (in percentages)
for the Three Subtests Separately and Together for 1984 Examinees (Form 1) and
for 1987 Examinees (Forms 17 and 18)

Each cell gives X (SD).

                Hebrew                  Arabic                  English
Variable Form   Males       Females     Males       Females     Males       Females

FO-A*    1      0.9 (3.2)   1.6 (4.3)   1.8 (n/l)   n/l (n/l)   1.0 (n/l)   1.1 (3.5)
         18     0.7 (2.5)   1.1 (3.7)   --          --          0.6 (2.4)   0.9 (3.0)
         17     0.7 (2.5)   1.1 (3.5)   0.7 (n/l)   n/l (3.1)   --          --
MA-A     1      1.1 (4.4)   3.1 (7.5)   n/l (n/l)   n/l (n/l)   1.8 (5.2)   2.4 (5.8)
         18     0.9 (3.0)   1.8 (5.2)   --          --          0.5 (1.9)   1.1 (4.4)
         17     0.9 (3.0)   1.8 (5.0)   n/l (n/l)   1.4 (4.3)   --          --
EN-A     1      0.7 (2.8)   1.8 (5.4)   1.7 (n/l)   n/l (8.1)   0.3 (1.3)   0.2 (1.2)
         18     0.7 (2.9)   1.4 (4.8)   --          --          0.1 (0.3)   0.1 (0.8)
         17     n/l (n/l)   1.3 (4.4)   0.7 (2.5)   1.4 (5.0)   --          --
Total-A  1      0.9 (2.8)   2.1 (4.8)   2.0 (4.4)   4.4 (7.4)   0.9 (2.2)   1.0 (2.5)
         18     0.8 (2.1)   1.5 (3.7)   --          --          0.3 (1.1)   0.6 (1.9)
         17     0.8 (2.2)   1.4 (3.4)   0.7 (2.0)   1.3 (3.7)   --          --
FO-R**   1      2.2 (6.4)   3.8 (8.7)   0.9 (n/l)   1.2 (4.5)   2.5 (6.9)   2.5 (7.1)
         18     1.8 (n/l)   2.5 (7.5)   --          --          1.6 (6.2)   2.2 (7.0)
         17     1.6 (5.7)   3.1 (3.5)   0.5 (3.4)   0.6 (2.8)   --          --
MA-R     1      1.9 (7.0)   4.1 (10.1)  1.1 (4.9)   1.8 (6.6)   1.7 (5.1)   2.4 (7.1)
         18     0.9 (4.4)   1.5 (5.9)   --          --          0.6 (3.7)   0.6 (3.0)
         17     0.9 (4.4)   2.3 (7.5)   0.5 (3.2)   0.8 (n/l)   --          --
EN-R     1      2.2 (7.2)   5.0 (10.7)  1.6 (6.6)   4.3 (10.7)  0.9 (5.5)   0.2 (2.0)
         18     1.7 (6.8)   2.7 (8.5)   --          --          0.1 (1.5)   0.0 (0.1)
         17     1.2 (5.5)   2.7 (8.7)   0.6 (4.1)   1.1 (4.8)   --          --
Total-R  1      2.1 (6.0)   4.5 (8.7)   1.3 (4.4)   2.8 (6.8)   1.5 (4.2)   1.4 (3.5)
         18     1.4 (4.8)   2.3 (6.1)   --          --          0.7 (2.5)   0.7 (2.1)
         17     1.2 (4.2)   2.7 (6.9)   0.5 (3.1)   0.9 (3.0)   --          --

                French                  Spanish                 Russian
Variable Form   Males       Females     Males       Females     Males       Females

FO-A*    1      1.6 (4.1)   1.7 (4.0)   0.7 (1.9)   2.1 (4.1)   0.8 (2.4)   1.1 (2.9)
         18     0.3 (1.1)   0.7 (2.5)   0.5 (1.7)   0.7 (2.6)   0.7 (2.0)   1.0 (3.3)
MA-A     1      2.9 (7.6)   4.4 (8.7)   1.2 (2.8)   4.0 (7.1)   2.3 (5.9)   4.6 (9.6)
         18     0.6 (2.1)   0.9 (4.9)   0.5 (1.6)   1.7 (5.0)   1.1 (2.7)   1.3 (4.6)
EN-A     1      1.3 (3.4)   0.6 (1.7)   0.8 (2.5)   0.9 (1.9)   2.9 (8.3)   2.3 (6.1)
         18     0.9 (4.1)   0.8 (3.6)   0.5 (2.8)   0.8 (2.7)   1.7 (4.9)   1.9 (6.2)
Total-A  1      1.8 (3.9)   1.9 (3.3)   0.9 (1.8)   2.1 (3.1)   2.2 (5.7)   2.7 (5.3)
         18     0.7 (2.3)   0.8 (3.3)   0.5 (1.6)   1.0 (2.7)   1.3 (2.8)   1.5 (4.5)
FO-R**   1      1.9 (6.1)   3.4 (8.4)   1.0 (5.0)   1.8 (5.6)   3.7 (10.7)  7.1 (13.7)
         18     0.7 (4.7)   0.9 (4.8)   1.0 (5.1)   1.4 (5.4)   3.3 (9.7)   4.0 (9.8)
MA-R     1      1.3 (6.7)   2.6 (6.9)   0.3 (3.6)   2.1 (7.7)   3.2 (9.8)   7.8 (14.7)
         18     0.5 (3.1)   0.7 (4.4)   0.5 (3.4)   1.3 (4.9)   2.4 (7.8)   3.7 (9.8)
EN-R     1      2.2 (6.1)   4.9 (11.5)  1.0 (5.1)   1.6 (5.1)   4.8 (12.2)  9.5 (14.4)
         18     0.3 (2.5)   0.8 (3.7)   0.6 (5.3)   1.2 (5.7)   2.9 (8.6)   5.9 (14.7)
Total-R  1      1.9 (4.7)   n/l (7.2)   0.8 (3.9)   1.8 (4.9)   4.1 (10.2)  8.4 (11.1)
         18     0.5 (3.1)   0.8 (3.1)   0.7 (3.3)   1.3 (4.2)   2.9 (5.9)   4.7 (9.3)

* "A" denotes the proportion of omitted items
** "R" denotes the proportion of unreached items
Note. n/l = not legible in the source copy; -- = form not administered in that language-version.